Green Destiny + mpiBLAST = Bioinfomagic
Abstract
This paper outlines how our highly efficient, power-aware supercomputer called Green Destiny and our open-source parallelization of BLAST called mpiBLAST combine to create a bit of “bioinfomagic.” Green Destiny, featured in The New York Times and winner of a 2003 R&D 100 Award, revolutionized high-performance computing by redefining performance to focus on issues of efficiency, reliability, and availability. Green Destiny is a 240-processor supercomputer that operates at a peak rate of 240 billion floating-point operations per second (or 240 gigaflops) but fits in six square feet and sips as little as 3.2 kilowatts of power. Consequently, it does not require any special infrastructure to operate, i.e., no cooling, no raised floor, no air filtration, and no humidification control. These attributes attracted interest from several pharmaceutical and bioinformatics institutions, which likewise did not have special infrastructure to house traditional supercomputing clusters. Subsequent interactions with these institutions led to the birth of mpiBLAST, an open-source parallelization of BLAST that achieves super-linear speed-up via a technique called database segmentation. Database segmentation allows each computing node to search a smaller portion of the database (one that fits entirely in memory), thus eliminating disk I/O and vastly improving performance. Running mpiBLAST on Green Destiny, we demonstrate that a 300-kB BLAST query that takes nearly one full day to complete on a traditional PC or workstation takes only minutes.
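The database segmentation idea described in the abstract can be illustrated with a minimal Python sketch. This is not mpiBLAST's implementation: the function names (`segment_database`, `search_fragment`, `search_segmented`), the round-robin split, and the shared k-mer score standing in for a real BLAST alignment score are all assumptions made for illustration. The fragments are also searched in a serial loop here, whereas in a cluster each fragment would be handled by a separate node.

```python
def kmers(seq, k=3):
    """Return the set of length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def segment_database(records, n_fragments):
    """Split (name, sequence) records round-robin into n_fragments lists.

    Hypothetical partitioning scheme; the goal is only that each
    fragment is small enough to fit in one node's memory.
    """
    fragments = [[] for _ in range(n_fragments)]
    for i, record in enumerate(records):
        fragments[i % n_fragments].append(record)
    return fragments


def search_fragment(query, fragment, k=3):
    """Score every record in one fragment against the query.

    The score is the number of shared k-mers, a toy stand-in for a
    BLAST alignment score (assumption, not the real algorithm).
    """
    query_kmers = kmers(query, k)
    return [(name, len(query_kmers & kmers(seq, k))) for name, seq in fragment]


def search_segmented(query, records, n_fragments, k=3):
    """Search each fragment independently, then merge and rank the hits."""
    hits = []
    for fragment in segment_database(records, n_fragments):
        # In a cluster, this call would run on the node that owns the fragment.
        hits.extend(search_fragment(query, fragment, k))
    # Highest score first; ties broken by name so the order is deterministic.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


if __name__ == "__main__":
    database = [
        ("a", "ACGTACGT"),
        ("b", "TTTTTTTT"),
        ("c", "ACGTTTTT"),
        ("d", "GGGGCCCC"),
    ]
    for name, score in search_segmented("ACGTAC", database, n_fragments=3):
        print(name, score)
```

Because each record is scored independently of the others, the merged ranking does not depend on how many fragments the database is split into; only the amount of data each worker must hold changes.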
Similar resources
Green Destiny and its Evolving Parts
Although the performance of supercomputers on our n-body cosmology code has improved by a factor of nearly 2000 since 1991, the performance per watt has only improved 300-fold and the performance per square foot only 65-fold. Clearly, we are building less and less efficient supercomputers, thus resulting in the construction of new machine rooms and even entirely new buildings. Furthermore, as t...
The Design, Implementation, and Evaluation of mpiBLAST
mpiBLAST is an open-source parallelization of BLAST that achieves superlinear speed-up by segmenting a BLAST database and then having each node in a computational cluster search a unique portion of the database. Database segmentation permits each node to search a smaller portion of the database, eliminating disk I/O and vastly improving BLAST performance. Because database segmentation does not ...
ParaMEDIC: Parallel Metadata Environment for Distributed I/O and Computing
BLAST is a widely used software toolkit for genomic sequence search. mpiBLAST is a freely available, open-source parallelization of BLAST that uses database segmentation to allow different worker processors to search (in parallel) unique segments of the database. After searching, the workers write their output to a filesystem. While mpiBLAST has been shown to achieve high performance in clust...
ARGUS: Supercomputing in 1/10 Cubic Meter
We propose ARGUS, a high density, low power supercomputer built from an IXIA network analyzer chassis and load modules. The prototype is a diskless MPP scalable to 128 processors in a single 9U chassis. The entire system has a footprint of 1/4 m² (2.5 ft²), a volume of 0.09 m³ (3.3 ft³) and maximum power consumption of less than 2200 watts. We compare and contrast the characteristics of...
Architectural Refactoring for Fast and Modular Bioinformatics Sequence Search
Bioinformaticists use the Basic Local Alignment Search Tool (BLAST) to characterize an unknown sequence by comparing it against a database of known sequences, thus detecting evolutionary relationships and biological properties. mpiBLAST is a widely used, high-performance, open-source parallelization of BLAST that runs on a computer cluster, delivering super-linear speedups. However, the Achilles ...